    Bayesian DNA copy number analysis

    Background: Some diseases, such as tumors, can be related to chromosomal aberrations that change the DNA copy number. The copy number of an aberrant genome can be represented as a piecewise constant function, since it can exhibit regions of deletion or gain. In a healthy cell, by contrast, the copy number is two because we inherit one copy of each chromosome from each of our parents. Bayesian Piecewise Constant Regression (BPCR) is a Bayesian regression method for data that are noisy observations of a piecewise constant function. The method estimates the unknown number of segments, the endpoints of the segments and the levels of the segments of the underlying piecewise constant function. The Bayesian Regression Curve (BRC) estimates the same data with a smoothing curve. However, in the original formulation, some estimators failed to properly determine the corresponding parameters. For example, the boundary estimator did not take into account the dependency among the boundaries and could place more than one breakpoint at the same position, thereby losing segments. Results: We derived an improved version of BPCR (called mBPCR) and of BRC, changing the segment number estimator and the boundary estimator to enhance the fitting procedure. We also proposed an alternative estimator of the variance of the segment levels, which is useful for data with high noise. Using artificial data, we compared the original and the modified versions of BPCR and BRC with other regression methods, showing that our improved version of BPCR generally outperformed all the others. Similar results were also observed on real data. Conclusion: We propose an improved method for DNA copy number estimation, mBPCR, which performed very well compared to previously published algorithms. In particular, mBPCR was more powerful in detecting the true position of the breakpoints and small aberrations in very noisy data. Hence, from a biological point of view, our method can be very useful, for example, to find targets of genomic aberrations in clinical cancer samples.

    R-Gada: a fast and flexible pipeline for copy number analysis in association studies

    Background: Genome-wide association studies (GWAS) using Copy Number Variation (CNV) are becoming a central focus of genetic research. CNVs have successfully provided target genome regions for some disease conditions where simple genetic variation (i.e., SNPs) has previously failed to provide a clear association. Results: Here we present a new R package that integrates: (i) data import from the most common formats of Affymetrix, Illumina and aCGH arrays; (ii) a fast and accurate segmentation algorithm to call CNVs based on Genome Alteration Detection Analysis (GADA); and (iii) functions for displaying and exporting the copy number calls, identification of recurrent CNVs, multivariate analysis of population structure, and tools for performing association studies. Using a large dataset containing 270 HapMap individuals (Affymetrix Human SNP Array 6.0 Sample Dataset) we demonstrate a flexible pipeline implemented with the package. It requires less than one minute per sample (3 million probe arrays) on a single core computer, and provides flexible parallelization for very large datasets. Case-control data were generated from the HapMap dataset to demonstrate a GWAS analysis. Conclusions: The package provides the tools for creating a complete integrated pipeline from data normalization to statistical association. It can efficiently handle a massive volume of data consisting of millions of genetic markers and hundreds or thousands of samples with very accurate results.

    On the Adaptive Partition Approach to the Detection of Multiple Change-Points

    With an adaptive partition procedure, we can partition a “time course” into consecutive non-overlapping intervals such that the population means/proportions of the observations in two adjacent intervals are significantly different at a given significance level α. However, the widely used recursive combination or partition procedures do not guarantee a global optimization. We propose a modified dynamic programming algorithm to achieve a global optimization. Our method can provide consistent estimation results. In a comprehensive simulation study, our method shows improved performance compared to the recursive combination/partition procedures. In practice, α can be determined based on a cross-validation procedure. As an application, we consider the well-known Pima Indian Diabetes data. We explore the relationship between the diabetes risk and several important variables including the plasma glucose concentration, body mass index and age.
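
    The contrast between recursive partitioning and a globally optimal dynamic program can be illustrated with a minimal sketch. This is not the modified algorithm proposed in the paper: it is the classical least-squares segmentation recursion, and it assumes that the number of segments `k` is fixed in advance rather than chosen through a significance level. The function name `optimal_segmentation` and the example data are hypothetical.

```python
from typing import List, Sequence


def optimal_segmentation(y: Sequence[float], k: int) -> List[int]:
    """Return the start indices of the k contiguous segments of y that
    minimise the total within-segment squared error (illustrative only)."""
    n = len(y)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and len(y)")

    # Prefix sums give the squared error of any interval in O(1).
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for i, v in enumerate(y):
        s1[i + 1] = s1[i] + v
        s2[i + 1] = s2[i] + v * v

    def sse(i: int, j: int) -> float:
        """Squared error of y[i:j] around its own mean (requires j > i)."""
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # cost[m][j]: best cost of splitting y[:j] into exactly m segments.
    cost = [[inf] * (n + 1) for _ in range(k + 1)]
    back = [[0] * (n + 1) for _ in range(k + 1)]
    cost[0][0] = 0.0
    for m in range(1, k + 1):
        for j in range(m, n + 1):
            for i in range(m - 1, j):  # i is the start of the last segment
                c = cost[m - 1][i] + sse(i, j)
                if c < cost[m][j]:
                    cost[m][j] = c
                    back[m][j] = i

    # Walk the back-pointers from the end to recover the segment starts.
    starts = []
    j = n
    for m in range(k, 0, -1):
        j = back[m][j]
        starts.append(j)
    return starts[::-1]


# Hypothetical toy series with three visibly different levels.
segments = optimal_segmentation([1.0, 1.1, 0.9, 5.0, 5.2, 4.9, 9.0, 9.1], k=3)
print(segments)
```

    Because every possible position of the last change-point is examined for every prefix, the result is a global optimum of the chosen cost, at O(k n²) time; a recursive split-or-merge procedure only explores one greedy path through the same search space.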

    Genomic aberrations relate early and advanced stage ovarian cancer

    Background: Because of the distinct clinical presentation of early and advanced stage ovarian cancer, we aim to clarify whether these disease entities are solely separated by time of diagnosis or whether they arise from distinct molecular events. Methods: Sixteen early and sixteen advanced stage ovarian carcinomas, matched for histological subtype and differentiation grade, were included. Genomic aberrations were compared for each early and advanced stage ovarian cancer by array comparative genomic hybridization. To study how the aberrations correlate to the clinical characteristics of the tumors we clustered tumors based on the genomic aberrations. Results: The genomic aberration patterns in advanced stage cancer equalled those in early stage, but were more frequent in advanced stage (p = 0.012). Unsupervised clustering based on genomic aberrations yielded two clusters that significantly discriminated early from advanced stage (p = 0.001), and that did differ significantly in survival (p = 0.002). These clusters however did give a more accurate prognosis than histological subtype or differentiation grade. Conclusion: This study indicates that advanced stage ovarian cancer either progresses from early stage or from a common precursor lesion but that they do not arise from distinct carcinogenic molecular events. Furthermore, we show that array comparative genomic hybridization has the potential to identify clinically distinct patients.

    A response to Yu et al. "A forward-backward fragment assembling algorithm for the identification of genomic amplification and deletion breakpoints using high-density single nucleotide polymorphism (SNP) array", BMC Bioinformatics 2007, 8: 145

    Background: Yu et al. (BMC Bioinformatics 2007, 8: 145) have recently compared the performance of several methods for the detection of genomic amplification and deletion breakpoints using data from high-density single nucleotide polymorphism arrays. One of the methods compared is our non-homogeneous Hidden Markov Model approach. Our approach uses Markov Chain Monte Carlo for inference, but Yu et al. ran the sampler for a severely insufficient number of iterations for a Markov Chain Monte Carlo-based method. Moreover, they did not use the appropriate reference level for the non-altered state. Methods: We reran the analysis in Yu et al. using appropriate settings for both the Markov Chain Monte Carlo iterations and the reference level. Additionally, to show how easy it is to obtain answers to additional specific questions, we have added a new analysis targeted specifically to the detection of breakpoints. Results: The reanalysis shows that the performance of our method is comparable to that of the other methods analyzed. In addition, we can provide probabilities of a given spot being a breakpoint, something unique among the methods examined. Conclusion: Markov Chain Monte Carlo methods require a sufficient number of iterations before they can be assumed to yield samples from the distribution of interest. Running our method with too few iterations cannot be representative of its performance. Moreover, our analysis shows how our original approach can be easily adapted to answer specific additional questions (e.g., identify edges).

    Glioblastoma Subclasses Can Be Defined by Activity among Signal Transduction Pathways and Associated Genomic Alterations

    Glioblastoma multiforme (GBM) is an umbrella designation that includes a heterogeneous group of primary brain tumors. Several classification strategies of GBM have been reported, some by clinical course and others by resemblance to cell types either in the adult or during development. From a practical and therapeutic standpoint, classifying GBMs by signal transduction pathway activation and by mutation in pathway member genes may be particularly valuable for the development of targeted therapies. We performed targeted proteomic analysis of 27 surgical glioma samples to identify patterns of coordinate activation among glioma-relevant signal transduction pathways, then compared these results with integrated analysis of genomic and expression data of 243 GBM samples from The Cancer Genome Atlas (TCGA). In the pattern of signaling, three subclasses of GBM emerge which appear to be associated with predominance of EGFR activation, PDGFR activation, or loss of the RAS regulator NF1. The EGFR signaling class has prominent Notch pathway activation measured by elevated expression of Notch ligands, cleaved Notch receptor, and downstream target Hes1. The PDGF class showed high levels of PDGFB ligand and phosphorylation of PDGFRbeta and NFKB. NF1-loss was associated with lower overall MAPK and PI3K activation and relative overexpression of the mesenchymal marker YKL40. These three signaling classes appear to correspond with distinct transcriptomal subclasses of primary GBM samples from TCGA for which copy number aberration and mutation of EGFR, PDGFRA, and NF1 are signature events. Proteomic analysis of GBM samples revealed three patterns of expression and activation of proteins in glioma-relevant signaling pathways. These three classes are comprised of roughly equal numbers showing either EGFR activation associated with amplification and mutation of the receptor, PDGF-pathway activation that is primarily ligand-driven, or loss of NF1 expression. The associated signaling activities correlating with these sentinel alterations provide insight into glioma biology and therapeutic strategies.

    The AURORA pilot study for molecular screening of patients with advanced breast cancer – a study of the Breast International Group

    Several studies have demonstrated the feasibility of molecular screening of tumour samples for matching patients with cancer to targeted therapies. However, most of them have been carried out at institutional or national level. Herein, we report on the pilot phase of AURORA (NCT02102165), a European multinational collaborative molecular screening initiative for advanced breast cancer patients. Forty-one patients were prospectively enrolled at four participating centres across Europe. Metastatic tumours were biopsied and profiled using an Ion Torrent sequencing platform at a central facility. Sequencing results were obtained for 63% of the patients in real time, with variable turnaround time stemming from delays between patient consent and biopsy. At least one clinically actionable mutation was identified in 73% of patients. We used the Illumina sequencing technology for orthogonal validation and achieved an average of 66% concordance of substitution calls per patient. Additionally, copy number aberrations inferred from the Ion Torrent sequencing were compared to single nucleotide polymorphism arrays and found to be 59% concordant on average. Although this study demonstrates that powerful next generation genomic techniques are logistically ready for international molecular screening programs in routine clinical settings, technical challenges remain to be addressed in order to ensure the accuracy and clinical utility of the genomic data.

    A new classification method using array Comparative Genome Hybridization data, based on the concept of Limited Jumping Emerging Patterns

    Background: Classification using aCGH data is an important and insufficiently investigated problem in bioinformatics. In this paper we propose a new classification method for DNA copy number data based on the concept of limited Jumping Emerging Patterns. We compare our limJEPClassifier to SVM, which is considered the most successful classifier in the case of high-throughput data. Results: Our results revealed that the classification performance of limJEPClassifier is significantly higher than that of other methods. Furthermore, we show that application of the limited JEPs can significantly improve classification when strongly unbalanced data are given. Conclusion: aCGH has become a very important tool in research on cancer and genomic disorders. Therefore, improving classification of aCGH data can have a great impact on many medical issues, such as the process of diagnosis and finding disease-related genes. The performed experiment shows that the application of Jumping Emerging Patterns can be effective in the classification of high-dimensional data, including those from aCGH experiments.

    Intra-tumour genetic heterogeneity and poor chemoradiotherapy response in cervical cancer

    Background: Intra-tumour genetic heterogeneity has been reported in both leukaemias and solid tumours and is implicated in the development of drug resistance in CML and AML. The role of genetic heterogeneity in drug response in solid tumours is unknown. Methods: To investigate intra-tumour genetic heterogeneity and chemoradiation response in advanced cervical cancer, we analysed 10 cases treated on the CTCR-CE01 clinical study. Core biopsies for molecular profiling were taken from four quadrants of the cervix pre-treatment and at weeks 2 and 5 of treatment. Biopsies were scored for cellularity and profiled using Agilent 180k human whole genome CGH arrays. We compared genomic profiles from 69 cores from 10 patients to test for genetic heterogeneity and treatment effects at weeks 0, 2 and 5 of treatment. Results: Three patients had two or more distinct genetic subpopulations pre-treatment. Subpopulations within each tumour showed differential responses to chemoradiotherapy. In two cases, there was selection for a single intrinsically resistant subpopulation that persisted at detectable levels after 5 weeks of chemoradiotherapy. Phylogenetic analysis reconstructed the order in which genomic rearrangements occurred in the carcinogenesis of these tumours and confirmed gain of 3q and loss of 11q as early events in cervical cancer progression. Conclusion: Selection effects from chemoradiotherapy cause dynamic changes in genetic subpopulations in advanced cervical cancers, which may explain disease persistence and subsequent relapse. Significant genetic heterogeneity in advanced cervical cancers may therefore be predictive of poor outcome.

    A comparison of genomic copy number calls by Partek Genomics Suite, Genotyping Console and Birdsuite algorithms to quantitative PCR

    Background: Copy number variants are >1 kb genomic amplifications or deletions that can be identified using array platforms. However, arrays produce substantial background noise that contributes to high false discovery rates of variants. We hypothesized that quantitative PCR could definitively determine copy number and assess the validity of calling algorithms. Results: Using data from 29 Affymetrix SNP 6.0 arrays, we determined copy numbers using three programs: Partek Genomics Suite, Affymetrix Genotyping Console 2.0 and Birdsuite. We compared array calls at 25 chromosomal regions to those determined by qPCR and found nearly identical calls in regions of copy number 2. Conversely, agreement differed in regions called variant by at least one method. The highest overall agreement in calls, 91%, was between Birdsuite and quantitative PCR. Partek Genomics Suite calls agreed with quantitative PCR 76% of the time, while the agreement of Affymetrix Genotyping Console 2.0 with quantitative PCR was 79%. Conclusions: In 38 independent samples, 96% of Birdsuite calls agreed with quantitative PCR. Analysis of three copy number calling programs and quantitative PCR showed Birdsuite to have the greatest agreement with quantitative PCR.